# **The Price of Uncertainty in Present-Biased Planning**

Susanne Albers and Dennis Kraft

Department of Computer Science, Technical University of Munich, Munich, Germany {albers,kraftd}@in.tum.de

**Abstract.** The tendency to overestimate immediate utility is a common cognitive bias. As a result people behave inconsistently over time and fail to reach long-term goals. Behavioral economics tries to help affected individuals by implementing external incentives. However, designing robust incentives is often difficult due to imperfect knowledge of the parameter β ∈ (0, 1] quantifying a person's present bias. Using the graphical model of Kleinberg and Oren [8], we approach this problem from an algorithmic perspective. Based on the assumption that the only information about β is its membership in some set B ⊂ (0, 1], we distinguish between two models of uncertainty: one in which β is fixed and one in which it varies over time. As our main result we show that the conceptual loss of efficiency incurred by incentives in the form of penalty fees is at most 2 in the former and 1 + max B / min B in the latter model. We also give asymptotically matching lower bounds and approximation algorithms.

**Keywords:** Approximation algorithms · Behavioral economics · Heterogeneous agents · Incentive design · Penalty fees · Variable present bias

### **1 Introduction**

Many goals in life such as losing weight, passing an exam or paying off a loan require long-term planning. But while some people stick to their plans, others lack self-control; they eat unhealthy food, delay their studies and take out new loans. In behavioral economics the tendency to change a plan for no apparent reason is known as *time-inconsistent behavior*. The questions are, what causes these inconsistencies and why do they affect some more than others? A common explanation is that people make present biased decisions, i.e., they assign disproportionately greater value to the present than to the future. In this simplifying model a person's behavior is the mere result of her present bias and the setting in which she is placed. However, the interplay between these two factors is intricate and sometimes counter-intuitive as the following example demonstrates:

Consider two runners Alice and Bob who have two weeks to prepare for an important race. Each week they must choose between two types of workout. Type A always incurs an effort of 1, whereas type B incurs an effort of 3 in the first and 9 in the second week. Since A offers less preparation than B, Alice and Bob's effort in the final race is 13 if they consistently choose A and 1 if they consistently choose B. Furthermore, A and B are incompatible in the sense that switching between the two will result in an effort of 16 in the final race. Figure 1 models this setting as a directed acyclic graph G with terminal nodes s and t. The intermediate nodes v<sub>X</sub> and v<sub>XY</sub> represent a person's state after completing the workouts X, Y ∈ {A, B}. To move forward with the training, Alice and Bob must perform the tasks associated with the edges of G, i.e., complete workouts and run the race. Looking at G it becomes clear that two consecutive workouts of type B are the most efficient routine in the long run. However, this is not necessarily the routine a present biased person will choose.

Work supported by the European Research Council, Grant Agreement No. 691672. © The Author(s) 2017

N. R. Devanur and P. Lu (Eds.): WINE 2017, LNCS 10674, pp. 325–339, 2017. https://doi.org/10.1007/978-3-319-71924-5\_23

For instance, assume that Alice and Bob discount future costs by a factor of a = 1/2 − ε and b = 1/2 + ε respectively. We call a and b their present bias. At the beginning of the first week Alice and Bob compare different workout routines. From Alice's perspective two workouts of type A are strictly preferable to two workouts of type B, as she anticipates an effort of 1 + a(1 + 13) = 8 − 14ε for the former and 3 + a(9 + 1) = 8 − 10ε for the latter. A similar calculation for Bob shows that he prefers two workouts of type B. Considering that neither Alice nor Bob finds a mix of A and B particularly interesting at this point, we conclude that Alice chooses A in the first week and Bob B. However, come next week, Bob expects an effort of 1 + 16b = 9 + 16ε for A and 9 + b = 19/2 + ε for B. Assuming ε is small enough, A suddenly becomes Bob's preferred option and he switches routines. Alice on the other hand has no reason to change her mind and sticks to A. As a result she pays much less than Bob during practice and in the final race. This is remarkable considering that her present bias is only marginally different from Bob's. Moreover, it seems surprising that only Bob behaves inconsistently, although he is less biased than Alice.
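The arithmetic of this example can be replayed in a few lines. The following is a minimal sketch that assumes the concrete value ε = 1/100 (the text only requires ε to be small) and uses a hypothetical helper `perceived`; it is meant as an illustration, not as part of the formal model.

```python
from fractions import Fraction

eps = Fraction(1, 100)      # assumed value; any sufficiently small eps > 0 behaves alike
a = Fraction(1, 2) - eps    # Alice's present bias
b = Fraction(1, 2) + eps    # Bob's present bias

def perceived(immediate, future, beta):
    """Immediate effort counts fully, all later effort is scaled by beta."""
    return immediate + beta * future

# Week 1: compare the plans "A, A, race" and "B, B, race".
alice_AA, alice_BB = perceived(1, 1 + 13, a), perceived(3, 9 + 1, a)
bob_AA, bob_BB = perceived(1, 1 + 13, b), perceived(3, 9 + 1, b)

# Week 2, after one workout of type B: switch to A (race effort 16) or stay with B (race effort 1).
bob_switch, bob_stay = perceived(1, 16, b), perceived(9, 1, b)

# Total effort actually paid along the routines the two runners end up following.
alice_total = 1 + 1 + 13    # A, A, race
bob_total = 3 + 1 + 16      # B, A, race

print(alice_AA < alice_BB)      # Alice starts with A
print(bob_BB < bob_AA)          # Bob starts with B
print(bob_switch < bob_stay)    # Bob abandons B in the second week
print(alice_total, bob_total)
```

Exact rational arithmetic is used so that comparisons between nearly equal perceived costs are not blurred by floating-point rounding.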

#### **1.1 Related Work**

Traditional economics and game theory are based on the assumption that people maximize their utility in a rational way. But despite its prevalence, this assumption disregards psychological aspects of human decision making observed in empirical and experimental research [5]. For instance, time-inconsistent behavior such as procrastination seems paradoxical in the light of traditional economics. Nevertheless, it can be explained readily by a tendency to overestimate immediate utility in long-term planning, see e.g. [13]. By studying such cognitive biases, behavioral economics tries to obtain more realistic economic models.

**Fig. 1.** Task graph of the running scenario

A significant amount of research in this field has been devoted to *temporal discounting* in general and *quasi-hyperbolic discounting* in particular, see [6] for a survey. The quasi-hyperbolic discounting model proposed by Laibson [11] is characterized by two parameters: the *present bias* β ∈ (0, 1] and the *exponential discount rate* δ ∈ (0, 1]. People who plan according to this model have an accurate perception of the present, but scale down any costs and rewards realized t ≥ 1 time units in the future by a factor of βδ<sup>t</sup>. To keep our work clearly delineated in scope, we adopt Akerlof's model of quasi-hyperbolic discounting [1] and make the following two assumptions: First, we focus on the present bias β and set the exponential discount rate to δ = 1. Secondly, we assume people to be *naive* in the sense that they are unaware of their present bias and only optimize their current perceived utility when making a decision. Note that Alice and Bob from the previous example behave like agents in Akerlof's model for a present bias of β = 1/2 − ε and β = 1/2 + ε respectively.
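As a small illustration of the discounting rule just described, the sketch below computes the weight a quasi-hyperbolic discounter attaches to a payoff t time units ahead. The function name `discount_factor` and the sample parameter values are our own choices for the example.

```python
def discount_factor(t, beta, delta=1.0):
    """Weight of a cost or reward realized t time units in the future.

    The present (t = 0) is perceived accurately; everything later is
    scaled by beta * delta**t.
    """
    if t < 0:
        raise ValueError("t must be non-negative")
    return 1.0 if t == 0 else beta * delta ** t

# With delta = 1, as assumed in this paper, all future periods share the weight beta.
weights = [discount_factor(t, beta=0.5) for t in range(4)]
print(weights)
```

With δ = 1 the only distinction the agent makes is between "now" and "later", which is exactly the simplification adopted in the remainder of the paper.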

Until recently the economic literature lacked a unifying and expressive framework for analyzing time-inconsistent behavior in complex social and economic settings. Kleinberg and Oren closed this gap by modeling the behavior of naively present biased individuals as a planning problem in task graphs like the one depicted in Fig. 1 [8]. We introduce this framework formally in Sect. 2. As a result of Kleinberg and Oren's work, an active line of research at the intersection of computer science and behavioral economics has emerged. For instance, the graphical model has been used to systematically analyze different types of quasi-hyperbolic discounting agents such as *sophisticated* agents who are fully or partially aware of their present bias [9] and agents whose present bias varies randomly over time [7]. Furthermore, the graphical model was used to shed light on the interplay between temporal biases and other types of cognitive biases [10].

The graphical model is of particular interest to us as it provides a natural framework for a design problem frequently encountered in behavioral economics. Given a certain social or economic setting, the problem is to improve a time-inconsistent person's performance via various sorts of *incentives*, such as monetary rewards, deadlines or penalty fees, see e.g. [12]. Using the graphical model, Kleinberg and Oren demonstrate how a strategic choice reduction can incentivize people to reach predefined goals [8]. To implement their incentives, they simply remove the corresponding edges from the task graph. However, there is a computational drawback to this approach. As we have shown in previous work, an optimal set of edges to remove from a task graph with n nodes is NP-hard to approximate within a factor less than √n/3 [2]. A more general form of incentives avoiding these harsh complexity theoretic limitations are penalty fees. In the graphical model penalty fees are at least as powerful as choice reduction and admit a polynomial time 2-approximation [3].

#### **1.2 Incentive Design for an Uncertain Present Bias**

Frederick, Loewenstein and O'Donoghue have surveyed several attempts to estimate people's temporal discount functions [6]. But as estimates differ widely across studies and individuals, the difficulty of predicting a person's temporal discount function becomes apparent. Clearly, this poses a serious challenge for the design of reliable incentives. After all, Alice and Bob's scenario demonstrates how arbitrarily small changes in the present bias can cause significant changes in a person's behavior. In this work we address the effects of incomplete information about a person's present bias in two different notions of uncertainty.

In Sect. 3 we consider naive individuals whose exponential discount rate is δ = 1, but whose present bias β is unknown. The only prior information we have about β is its membership in some larger set B. Our goal is to construct incentives that are robust with respect to the uncertainty induced by B. More precisely, we are interested in incentives that work well for any present bias contained in B. An alternative perspective is that we try to construct incentives which are not limited to a single person, but serve an entire population of individuals with different present bias values. A simple instance of this problem in which a single task must be partitioned and stretched over a longer period of time has been studied by Kleinberg and Oren [8]. But like most research on incentivizing *heterogeneous* populations, see e.g. [12], Kleinberg and Oren's results are restricted to a very specific setting. They themselves suggest the design of more general incentives as a major research direction for the graphical framework [8].

Using penalty fees as our incentive of choice and a fixed reward to keep people motivated, we present the first results in this area. Our contribution is twofold. On the one hand, we try to quantify the conceptual loss of efficiency caused by incomplete knowledge of β. For this purpose we introduce a novel concept called *price of uncertainty*, which denotes the smallest ratio between the reward required by an incentive that accommodates all β ∈ B and the reward required by an incentive designed for a specific β ∈ B. We present an elegant algorithmic argument to prove that the price of uncertainty is at most 2. Remarkably, this bound holds true independent of the underlying graph G and present bias set B. To complement our result, we construct a family of graphs G and present bias sets B for which the price of uncertainty converges to a value strictly greater than 1. On the other hand, we consider the computational problem of constructing penalty fees that work for all β ∈ B, but require as little reward as possible. Drawing on the same algorithmic ideas we used to bound the price of uncertainty yields a polynomial time 2-approximation. Furthermore, we present a non-trivial proof to show that the decision version of the problem is contained in NP. Since all hardness results of [3] also apply under uncertainty, we know that there is no 1.08192-approximation unless P = NP.

#### **1.3 Incentive Design for a Variable Present Bias**

In Sect. 4 we generalize our notion of uncertainty to individuals whose present bias β may change arbitrarily over time within the set B. This model is inspired by work of Gravin et al. [7], except that we do not rely on the assumption that β is drawn independently from a fixed probability distribution. Instead, our goal is to design penalty fees that work well for all possible sequences of β over time. We believe this to be an interesting extension of the fixed parameter case as the variability of β may capture changes in a person's temporal discount function caused by unforeseen cognitive biases different from her present bias. As a result we obtain more robust penalty fees.

Again, our contribution is twofold. On the one hand, we introduce the *price of variability* to quantify the conceptual loss of efficiency caused by unpredictable changes in β. Similar to the price of uncertainty, we define this quantity to be the smallest ratio between the reward required by an incentive that accommodates all possible changes of β ∈ B over time and the reward required by an incentive designed for a specific and fixed β ∈ B. However, unlike the price of uncertainty, the price of variability has no constant upper bound. Instead, the ratio seems closely related to the *range* τ = max B / min B of the set B. By generalizing our algorithm from Sect. 3 we obtain an upper bound of 1 + τ for the price of variability. To complement this result, we construct a family of graphs G for which the price of variability converges to τ/2. On the other hand, we consider the computational aspects of constructing penalty fees for a variable β. As a result of the unbounded price of variability, we are not able to come up with a constant-factor polynomial time approximation. Instead, we obtain a (1 + τ)-approximation. However, by using a sophisticated reduction from VECTOR SCHEDULING, we prove that no efficient constant-factor approximation is possible unless NP = ZPP. We conclude our work by studying a curious special case of variability in which individuals may temporarily lose their present bias. For this scenario, which is characterized by the assumption that 1 ∈ B, optimal penalty fees can be computed in polynomial time.

### **2 The Model**

In the following we introduce Kleinberg and Oren's graphical framework [8]. Let G = (V, E) be a directed acyclic graph with n nodes that models some long-term project. The start and end states are denoted by the terminal nodes s and t. Furthermore, each edge e of G corresponds to a specific task whose incurred effort is captured by a non-negative cost c(e). To finish the project, a present biased agent must sequentially complete all tasks along a path from s to t. However, instead of following a fixed path, the agent constructs her path dynamically according to the following simple procedure:

When located at any node v different from t, the agent tries to evaluate the minimum cost she needs to pay in order to reach t. For this purpose she considers all outgoing edges (v, w) of her current position v. Because the tasks associated with these edges must be performed immediately, the agent assesses their cost correctly. In contrast, all future tasks, i.e., tasks on a path from v to t not incident to v, are discounted by her present bias of β ∈ (0, 1]. As a result, we define her *perceived cost* for taking (v, w) to be d<sub>β</sub>(v, w) = c(v, w) + βd(w), where d(w) denotes the cost of a cheapest path from w to t. Furthermore, we define d<sub>β</sub>(v) = min{c(v, w) + βd(w) | (v, w) ∈ E} to be the agent's *minimum perceived cost* at v. Since the agent is oblivious to her own present bias, she only traverses edges (v, w) for which d<sub>β</sub>(v, w) = d<sub>β</sub>(v). Ties are broken arbitrarily. Once the agent reaches the next node, she reiterates this process.
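The quantities d(w), the perceived edge cost and the minimum perceived cost can be computed directly from their definitions. The sketch below does so for the task graph of the running scenario; the edge list is our reconstruction of Fig. 1 from the description in Sect. 1 (in particular, the cost 9 of the edge from v<sub>A</sub> to v<sub>AB</sub> is inferred from the effort of a type B workout in the second week), and all function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

# Edge costs c(v, w) of the running scenario, reconstructed from the text.
COST = {
    ("s", "vA"): 1, ("s", "vB"): 3,
    ("vA", "vAA"): 1, ("vA", "vAB"): 9,
    ("vB", "vAB"): 1, ("vB", "vBB"): 9,
    ("vAA", "t"): 13, ("vAB", "t"): 16, ("vBB", "t"): 1,
}

def successors(v):
    return [w for (u, w) in COST if u == v]

@lru_cache(maxsize=None)
def d(v):
    """Cost of a cheapest path from v to t; the recursion ends because G is acyclic."""
    if v == "t":
        return 0
    return min(COST[v, w] + d(w) for w in successors(v))

def perceived_edge_cost(v, w, beta):
    """Perceived cost c(v, w) + beta * d(w) of taking the edge (v, w)."""
    return COST[v, w] + beta * d(w)

def preferred_edges(v, beta):
    """Minimum perceived cost at v and all successors attaining it."""
    costs = {w: perceived_edge_cost(v, w, beta) for w in successors(v)}
    best = min(costs.values())
    return best, [w for w, c in costs.items() if c == best]

alice, bob = Fraction(49, 100), Fraction(51, 100)  # assumed: eps = 1/100
print(preferred_edges("s", alice))
print(preferred_edges("s", bob))
print(preferred_edges("vB", bob))
```

Returning every successor that attains the minimum, rather than a single one, mirrors the rule that ties are broken arbitrarily.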

To motivate the agent, a non-negative reward r is placed at t. Because the agent must reach t before she can collect r, her *perceived reward* for reaching t is βr at each node different from t. When located at v ≠ t, the agent is only motivated to proceed if d<sub>β</sub>(v) ≤ βr. Otherwise, if d<sub>β</sub>(v) > βr, she quits. We say that G is *motivating* if she does not quit while constructing her path from s to t. Note that sometimes the agent can construct more than one path from s to t due to ties in the perceived cost of incident edges. In this case, G is considered motivating if she does not quit on any such path.

For the sake of a clear presentation, we will assume throughout this work that each node of G is located on a path from s to t. This assumption is sensible because the agent can only visit nodes reachable from s. Furthermore, she is not willing to enter nodes that do not lead to the reward at t. Consequently, only nodes that are on a path from s to t are relevant to her behavior. All nodes not satisfying this property can be removed from G in a simple preprocessing step.

#### **2.1 Alice and Bob's Scenario**

To illustrate the model, we revisit Alice and Bob's scenario. The task graph G is depicted in Fig. 1. Remember that a = 1/2 − ε and b = 1/2 + ε denote Alice and Bob's respective present bias. For convenience let 0 < ε ≤ 1/54. Furthermore, assume that a reward of r = 27 is awarded upon reaching t.

We proceed to analyze Alice and Bob's walk through G. At their initial position s they must decide whether they move to v<sub>A</sub> or v<sub>B</sub>. For this purpose they try to find a path that minimizes the perceived cost. As the more present biased person, Alice's favorite path is s, v<sub>A</sub>, v<sub>AA</sub>, t with a perceived cost of d<sub>a</sub>(s) = d<sub>a</sub>(s, v<sub>A</sub>) = 8 − 14ε. By choice of ε this cost is covered by her perceived reward ar = 27/2 − 27ε. Consequently, she is motivated to traverse the first edge and moves to v<sub>A</sub>. A similar argument shows that Bob moves to v<sub>B</sub>. Once they reach their new nodes, Alice and Bob reevaluate their plans. From Alice's perspective v<sub>A</sub>, v<sub>AA</sub>, t is still the cheapest path to t. Bob, however, suddenly prefers v<sub>B</sub>, v<sub>AB</sub>, t to his original plan. Nevertheless, both of their perceived costs remain covered by their perceived rewards and they move to v<sub>AA</sub> and v<sub>AB</sub> respectively. At this point the only option is to take the direct edge to t. For Alice the perceived cost at v<sub>AA</sub> is sufficiently small to let her reach t. In contrast, Bob's perceived cost of d<sub>b</sub>(v<sub>AB</sub>) = 16 exceeds his perceived reward of br = 27/2 + 27ε and he quits.

#### **2.2 Cost Configurations**

Bob's behavior in the previous example demonstrates how present biased decisions can deter people from reaching predefined goals. To ensure an agent's success it is therefore sometimes necessary to implement external incentives such as penalty fees. In the graphical model, penalty fees allow us to arbitrarily raise the cost of edges in G. More formally, let c̃ be a so-called *cost configuration*, which assigns a non-negative extra cost c̃(e) to all edges e of G. The result is a new task graph G<sub>c̃</sub>, whose edges e have a cost of c(e) + c̃(e). A present biased agent navigates through G<sub>c̃</sub> according to the same rules applying in G. We say that c̃ is motivating if and only if G<sub>c̃</sub> is. To avoid ambiguity we annotate our notation whenever we consider a specific c̃, e.g., we write d<sub>c̃</sub> and d<sub>β,c̃</sub> instead of d and d<sub>β</sub>.

We conclude this section with a brief demonstration of the positive effects penalty fees can have in Alice and Bob's scenario. Let c̃ be a cost configuration that assigns an extra cost of c̃(v<sub>B</sub>, v<sub>AB</sub>) = 1/2 to (v<sub>B</sub>, v<sub>AB</sub>) and c̃(e) = 0 to all other edges e ≠ (v<sub>B</sub>, v<sub>AB</sub>). Note that G and G<sub>c̃</sub> are identical task graphs except for the cost of (v<sub>B</sub>, v<sub>AB</sub>). Because Alice does not plan to take (v<sub>B</sub>, v<sub>AB</sub>) on her way through G and has even less reason to do so in G<sub>c̃</sub>, we know that c̃ does not affect her behavior. For similar reasons, c̃ does not affect Bob's choice to move to v<sub>B</sub>. However, once Bob has reached v<sub>B</sub> his perceived cost of the path v<sub>B</sub>, v<sub>AB</sub>, t is d<sub>b,c̃</sub>(v<sub>B</sub>, v<sub>AB</sub>) = 19/2 + 16ε, whereas his perceived cost of v<sub>B</sub>, v<sub>BB</sub>, t is only d<sub>b,c̃</sub>(v<sub>B</sub>, v<sub>BB</sub>) = 19/2 + ε. Since the latter option appears to be cheaper and is covered by his perceived reward, Bob proceeds to v<sub>BB</sub> and then onward to t. As a result c̃ yields a task graph that is motivating for Alice and Bob alike. This is a considerable improvement over the original task graph.
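The walk of a naive agent, with and without penalty fees, can be simulated in a few lines. The sketch below is an illustration under stated assumptions: the edge costs are our reconstruction of Fig. 1 from the text, ε is fixed to 1/100, and `is_motivating` is a hypothetical helper that follows every tied edge so that the graph only counts as motivating if the agent never quits on any walk.

```python
from fractions import Fraction

COST = {  # edge costs reconstructed from the description of the running scenario
    ("s", "vA"): 1, ("s", "vB"): 3,
    ("vA", "vAA"): 1, ("vA", "vAB"): 9,
    ("vB", "vAB"): 1, ("vB", "vBB"): 9,
    ("vAA", "t"): 13, ("vAB", "t"): 16, ("vBB", "t"): 1,
}

def is_motivating(beta, reward, extra=None):
    """True iff an agent with present bias beta never quits between s and t."""
    extra = extra or {}
    cost = {e: c + extra.get(e, 0) for e, c in COST.items()}
    out = {}
    for u, w in cost:
        out.setdefault(u, []).append(w)

    dist = {"t": 0}
    def d(v):  # cheapest path cost from v to t, including the extra cost
        if v not in dist:
            dist[v] = min(cost[v, w] + d(w) for w in out[v])
        return dist[v]

    frontier, seen = ["s"], set()
    while frontier:
        v = frontier.pop()
        if v == "t" or v in seen:
            continue
        seen.add(v)
        perceived = {w: cost[v, w] + beta * d(w) for w in out[v]}
        best = min(perceived.values())
        if best > beta * reward:      # perceived cost exceeds perceived reward
            return False
        frontier.extend(w for w, p in perceived.items() if p == best)
    return True

eps = Fraction(1, 100)                          # assumed; the text requires 0 < eps <= 1/54
a, b = Fraction(1, 2) - eps, Fraction(1, 2) + eps
fee = {("vB", "vAB"): Fraction(1, 2)}           # the cost configuration discussed above

print(is_motivating(a, 27), is_motivating(b, 27))            # without fees
print(is_motivating(a, 27, fee), is_motivating(b, 27, fee))  # with the fee
```

Under these assumptions the simulation reproduces the narrative: without fees only Alice reaches t, whereas the single fee on (v<sub>B</sub>, v<sub>AB</sub>) keeps both agents motivated.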

### **3 Uncertain Present Bias**

In this section we consider agents whose present bias β is uncertain in the sense that our only information about β is its membership in some set B ⊂ (0, 1]. We call B the *present bias set*. For technical reasons we assume that B can be expressed as the union of a constant number of closed subintervals of (0, 1]. This way the intersection of B with a closed interval is either empty or contains an efficiently computable minimal and maximal element. To measure the degree of uncertainty induced by B, we define the range of B as τ = max B / min B.

#### **3.1 A Decision Problem**

Our goal is to construct a cost configuration c̃ that is motivating for all β ∈ B, but requires as little reward as possible. To assess the complexity of this task, let UNCERTAIN PRESENT BIAS (UPB) be the following decision problem:

**Definition 1 (UPB).** *Given a task graph* G*, present bias set* B *and reward* r > 0*, decide whether a cost configuration* c̃ *motivating for all* β ∈ B *exists.*

If τ = 1, i.e., B only contains a single present bias parameter, UPB is identical to the decision problem MOTIVATING COST CONFIGURATION (MCC) studied in [3]. Since MCC is NP-complete, UPB must be NP-hard. But unlike MCC it is not immediately clear if UPB is also contained in NP. The reason is that proving MCC ∈ NP only requires verifying whether a given cost configuration is motivating for a single value of β, a property that can be checked in polynomial time [2]. However, proving UPB ∈ NP requires verifying whether a given cost configuration is motivating for all β ∈ B. Taking into account that B may very well be an infinite set, it becomes clear that we cannot check all values of β individually. Interestingly, we do not have to; checking a finite subset B′ ⊆ B of size O(n<sup>2</sup>) turns out to be sufficient.

**Proposition 1.** *For any task graph* G*, reward* r *and present bias set* B *a finite subset* B′ ⊆ B *of size* O(n<sup>2</sup>) *exists such that* G *is motivating for all* β ∈ B *if it is motivating for all* β ∈ B′*.*

The above proposition is related to a theorem by Kleinberg and Oren, which bounds the number of paths an agent takes as β varies over (0, 1] by O(n<sup>2</sup>) [8]. Kleinberg and Oren's argument not only establishes the existence of B′, but also yields a polynomial time algorithm to construct B′, which in turn implies that UPB ∈ NP. Due to space constraints, we refer to the full version of this paper for a corresponding proof of Proposition 1 as well as all other omitted proofs.

**Corollary 1.** *UPB is NP-complete.*

#### **3.2 The Price of Uncertainty**

Since UPB is NP-complete, it makes sense to consider the corresponding optimization problem UPB-OPT. For this purpose, let r(G, B) be the infimum over all rewards admitting a cost configuration motivating for all β ∈ B and define:

**Definition 2 (UPB-OPT).** *Given a task graph* G *and present bias set* B*, determine* r(G, B)*.*

Clearly, UPB-OPT must be at least as hard as the optimization version of MCC. Consequently, we know that UPB-OPT has no PTAS and is NP-hard to approximate within a ratio less than 1.08192 [3]. But does the transition from a certain to an uncertain β reduce approximability?

Setting complexity theoretic considerations aside for a moment, an even more general question arises: How does the transition from a certain to an uncertain β affect the efficiency of cost configurations assuming unlimited computational resources? To quantify this conceptual difference in efficiency, we look at the smallest ratio between optimal cost configurations motivating for all β ∈ B and optimal cost configurations motivating for a specific β ∈ B. We call this ratio the *price of uncertainty*.

**Definition 3 (Price of Uncertainty).** *Given a task graph* G *and a present bias set* B*, the price of uncertainty is defined as* r(G, B) / sup{r(G, {β}) | β ∈ B}*.*

**Algorithm 1.** UncertainPresentBiasApprox

1. b ← min B; P ← minmax path from s to t w.r.t. d<sub>b</sub>(e); α ← max{d<sub>b</sub>(e) | e ∈ P};
2. **foreach** v ∈ V \ {t} **do** ς(v) ← successor of v on a cheapest path from v to t;
3. T ← {(v, ς(v)) | v ∈ V \ {t}};
4. **foreach** e ∈ E **do** c̃(e) ← 0;
5. **foreach** e ∈ E \ (P ∪ T) **do** c̃(e) ← 2α/b + 1;
6. **foreach** (v, w) ∈ T **such that** v ∈ P **and** w ∉ P **do**
   - P′ ← v, ς(v), ς(ς(v)), ..., t;
   - u ← first node of P′ different from v that is also a node of P;
   - c̃(v, w) ← cost of most expensive edge of P between v and u;
7. **return** c̃;

Let us illustrate the price of uncertainty by going back to Alice and Bob's scenario and assume that B = {a, b} with a = 1/2 − ε and b = 1/2 + ε. In other words, the agent either behaves like Alice or she behaves like Bob, but we do not know which. It is easy to see that in either case the agent minimizes her maximum perceived cost on the way from s to t by taking the path P = s, v<sub>B</sub>, v<sub>BB</sub>, t. This minmax cost, which is either d<sub>a</sub>(v<sub>B</sub>, v<sub>BB</sub>) = 19/2 − ε or d<sub>b</sub>(v<sub>B</sub>, v<sub>BB</sub>) = 19/2 + ε, provides two lower bounds for the necessary reward when divided by the respective present bias. More formally, it holds true that r(G, {a}) ≥ (19/2 − ε)/(1/2 − ε) and r(G, {b}) ≥ (19/2 + ε)/(1/2 + ε). However, as we have seen in Sect. 2, neither Alice nor Bob is willing to follow P without external incentives. To discourage the agent from leaving P, we assign an extra cost of c̃(s, v<sub>A</sub>) = 5ε to (s, v<sub>A</sub>), c̃(v<sub>B</sub>, v<sub>AB</sub>) = 1/2 + 16ε to (v<sub>B</sub>, v<sub>AB</sub>) and c̃(e) = 0 otherwise. This extra cost does not affect the agent's maximum perceived cost along P, which she still experiences at (v<sub>B</sub>, v<sub>BB</sub>). As a result, our bounds for r(G, {a}) and r(G, {b}) are tight and we get sup{r(G, {β}) | β ∈ B} = r(G, {a}). Moreover, because we have used the same cost configuration c̃ to derive r(G, {a}) and r(G, {b}), it must hold true that r(G, B) = sup{r(G, {β}) | β ∈ B}, implying that the price of uncertainty in Alice and Bob's scenario is 1.
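The numbers in Alice and Bob's scenario can be double-checked with a short script. The sketch below assumes ε = 1/100 and our reconstruction of the edge costs of Fig. 1; the helper `required_reward` is hypothetical and returns the smallest reward under which an agent with a given present bias never quits, i.e., the maximum of her minimum perceived cost divided by β over all nodes she visits.

```python
from fractions import Fraction

COST = {  # edge costs reconstructed from the description of the running scenario
    ("s", "vA"): 1, ("s", "vB"): 3,
    ("vA", "vAA"): 1, ("vA", "vAB"): 9,
    ("vB", "vAB"): 1, ("vB", "vBB"): 9,
    ("vAA", "t"): 13, ("vAB", "t"): 16, ("vBB", "t"): 1,
}
eps = Fraction(1, 100)  # assumed value
a, b = Fraction(1, 2) - eps, Fraction(1, 2) + eps
FEES = {("s", "vA"): 5 * eps, ("vB", "vAB"): Fraction(1, 2) + 16 * eps}

def required_reward(beta, extra):
    """Smallest reward keeping an agent with present bias beta moving under `extra`."""
    cost = {e: c + extra.get(e, 0) for e, c in COST.items()}
    out = {}
    for u, w in cost:
        out.setdefault(u, []).append(w)

    def d(v):  # cheapest path cost to t in the graph with extra cost
        return 0 if v == "t" else min(cost[v, w] + d(w) for w in out[v])

    worst, stack = Fraction(0), ["s"]
    while stack:
        v = stack.pop()
        if v == "t":
            continue
        perceived = {w: cost[v, w] + beta * d(w) for w in out[v]}
        best = min(perceived.values())
        worst = max(worst, best / beta)          # she needs best <= beta * reward
        stack.extend(w for w, p in perceived.items() if p == best)
    return worst

# Lower bounds on r(G, {a}) and r(G, {b}) stated in the text.
lower_a = (Fraction(19, 2) - eps) / a
lower_b = (Fraction(19, 2) + eps) / b

# Reward that the common cost configuration needs to serve both agents.
reward_for_B = max(required_reward(beta, FEES) for beta in (a, b))
price = reward_for_B / max(lower_a, lower_b)
print(reward_for_B, price)
```

Since the reward needed by the shared fees coincides with the larger of the two lower bounds, the ratio evaluates to 1 for this choice of ε, in line with the discussion above.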

#### **3.3 Bounding the Price of Uncertainty**

As Alice and Bob's scenario demonstrates, cost configurations designed for an uncertain β are not necessarily less efficient than those designed for a specific β. Therefore one might wonder whether scenarios exist in which a real loss of efficiency is bound to occur, i.e., can the price of uncertainty be greater than 1? The following proposition shows that such scenarios indeed exist.

**Proposition 2.** *There exists a family of task graphs and present bias sets for which the price of uncertainty converges to* 1.1*.*

As the price of uncertainty can be strictly greater than 1, the question for an upper bound arises. Ideally, we would like to design a cost configuration c̃ motivating for all β ∈ B assuming the reward is set to ϱ · r(G, {b}) for some constant factor ϱ > 1 and b = min B. Clearly, the existence of such a c̃ would imply a constant bound of ϱ for the price of uncertainty independent of G and B. Using a generalized version of the approximation algorithm we proposed in [3], it is indeed possible to construct a c̃ with the desired property for ϱ = 2.

The main idea of UncertainPresentBiasApprox is simple: First, the algorithm computes a value α such that α/b is a lower bound on the reward necessary for agents with present bias b, i.e., r(G, {b}) ≥ α/b. In particular, this bound implies sup{r(G, {β}) | β ∈ B} ≥ α/b. Next the algorithm constructs a c̃ such that a reward of 2α/b is sufficiently motivating for all β ∈ B, i.e., r(G, B) ≤ 2α/b. As a result the price of uncertainty can be at most 2. In the following we try to convey the intuition behind the algorithm in more detail.

We begin with the computation of α. For this purpose let P be a path minimizing the maximum cost an agent with present bias b perceives on her way from s to t. We call P a *minmax path* and define α = max{d<sub>b</sub>(e) | e ∈ P} to be the maximum perceived edge cost of P. Since cost configurations cannot decrease edge cost, it should be clear that α/b is a valid lower bound on the reward required for the present bias b, i.e., r(G, {b}) ≥ α/b.
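The lower bound can be spelled out as follows; this is our own short sketch of the reasoning and not taken from the omitted proofs. Let c̃ be any cost configuration that is motivating for present bias b and reward r, and let Q be a path the agent walks from s to t. Because P is a minmax path, Q contains an edge (v, w) with d<sub>b</sub>(v, w) ≥ α. The agent traverses (v, w) without quitting, and extra cost can only increase the cost of a cheapest path from w to t, hence

$$b\,r \;\ge\; d_{b,\tilde{c}}(v) \;=\; d_{b,\tilde{c}}(v, w) \;=\; c(v, w) + \tilde{c}(v, w) + b\, d_{\tilde{c}}(w) \;\ge\; c(v, w) + b\, d(w) \;=\; d_{b}(v, w) \;\ge\; \alpha.$$

Dividing by b gives r ≥ α/b for every reward that admits a motivating cost configuration.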

We proceed with c̃. The goal is to assign extra cost in such a way that any agent with a present bias β ∈ B traverses only two kinds of edges. The first kind of edges are those on P. It is instructive to note that each such edge (v, w) ∈ P is motivating for a reward of α/b if β ≥ b. The reason is that

$$d_{\beta}(v, w) = \beta \left(\frac{c(v, w)}{\beta} + d(w)\right) \le \beta \left(\frac{c(v, w)}{b} + d(w)\right) = \beta \frac{d_{b}(v, w)}{b} \le \beta \frac{\alpha}{b}.$$

In particular, P is motivating for each present bias β ∈ B. The second kind of edges are on cheapest paths to t. To identify these edges, the algorithm assigns a distinct successor ς(v) to each node v ∈ V \ {t} such that (v, ς(v)) is the initial edge of a cheapest path from v to t. Since we assume t to be reachable from all other nodes of G, at least one suitable successor must exist. By definition of ς, we know that P′ = v, ς(v), ς(ς(v)), ..., t is a cheapest path from v to t. We call P′ the ς*-path* of v and T = {(v, ς(v)) | v ∈ V \ {t}} a *cheapest path tree*.

Remember that we try to keep agents on the edges of P and T. For this purpose, we assign an extra cost of c̃(e) = 2α/b + 1 to all other edges. This raises their perceived cost to at least 2α/b + 1, a price no agent is willing to pay for a perceived reward of β · 2α/b. However, since we have not assigned any extra cost to T so far, the perceived cost of edges in P and T is unaffected by the current c̃. In particular, all edges of P are still motivating for a reward of α/b and any present bias β ∈ B. To keep agents from entering costly ς-paths P′ = v, ς(v), ς(ς(v)), ..., t, we assign an extra cost to the out-edges (v, ς(v)) of P, i.e., the edges with v ∈ P but ς(v) ∉ P. The extra cost c̃(v, ς(v)) is chosen to match the cost of a most expensive edge on P′ between v and the next intersection of P′ and P. It is easy to see that the resulting c̃ can no more than double the perceived cost of any edge in P; see the proof of Theorem 1 for a precise argument. Furthermore, the perceived cost of any out-edge (v, ς(v)) of P is either high enough to keep agents on P or they do not encounter edges exceeding the perceived cost of (v, ς(v)) until they reenter P. We conclude that a reward of 2α/b is sufficiently motivating, leading us to one of the central results of our work.

**Theorem 1.** *The price of uncertainty is at most* 2*.*

It is interesting to note that UncertainPresentBiasApprox can be executed in polynomial time. Furthermore, in the proof of Theorem 1 we argue that α/b ≤ r(G, B) ≤ 2α/b. As a result we have also found an efficient constant factor approximation of UPB-OPT.

**Corollary 2.** *UPB-OPT admits a polynomial time* 2*-approximation.*

### **4 Variable Present Bias**

So far we have considered agents with an unknown but fixed present bias. We now generalize this model to agents whose β may vary arbitrarily within B as they progress through G. It is convenient to think of β as a *present bias configuration*, i.e., an assignment of present bias values β(v) ∈ B to the nodes v of G. Whenever the agent reaches a node v, she acts according to the current present bias value β(v). We say that G is motivating with respect to a present bias configuration β if and only if the agent does not quit on a walk from s to t.
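The effect of a present bias configuration can be made concrete with a small simulation. The following sketch rests on assumptions that are not fixed by the text above: the agent at v perceives an edge (v, w) as costing c(v, w) + β(v) · d(w), she picks an edge of minimum perceived cost, and she quits as soon as this cost exceeds the perceived reward β(v) · r. The function name `walk`, the graph encoding, the tie-breaking by node name and the toy instance are hypothetical.

```python
from math import inf


def walk(order, cost, s, t, beta, reward):
    """Simulate the agent; return the visited nodes, or None if she quits.

    order: nodes in topological order (assumed input format)
    cost:  dict mapping edges (v, w) to their cost, including any extra cost
    beta:  dict mapping each node v to the present bias value beta(v)
    """
    # d[v]: cost of a cheapest path from v to t.
    d = {v: inf for v in order}
    d[t] = 0.0
    for v in reversed(order):
        for (x, w), c in cost.items():
            if x == v:
                d[v] = min(d[v], c + d[w])

    path, v = [s], s
    while v != t:
        # Perceived cost of each out-edge under the current bias beta(v).
        options = [(c + beta[v] * d[w], w) for (x, w), c in cost.items() if x == v]
        if not options:
            return None  # dead end
        perceived, nxt = min(options)  # ties are broken by node name (assumption)
        if perceived > beta[v] * reward:
            return None  # the agent quits at v
        path.append(nxt)
        v = nxt
    return path


# Hypothetical toy instance, not taken from the paper.
order = ["s", "a", "t"]
cost = {("s", "a"): 1.0, ("a", "t"): 4.0, ("s", "t"): 6.0}
print(walk(order, cost, "s", "t", {"s": 0.9, "a": 0.5, "t": 0.5}, reward=8))
print(walk(order, cost, "s", "t", {"s": 0.9, "a": 0.4, "t": 0.4}, reward=8))
```

In the first call the agent reaches t along s, a, t. In the second call her present bias drops to 0.4 at a, the perceived reward 0.4 · 8 no longer covers the edge cost 4, and she quits.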

To illustrate the consequences of a variable present bias we revisit Alice and Bob's scenario once more. Recall that the agent in this scenario is either like Alice with a present bias of a = 1/2 − ε or like Bob with a present bias of b = 1/2 + ε, i.e., B = {a, b}. But while she had to commit to one present bias before, she is now free to change between a and b. For instance, her present bias could be b at s and v<sub>B</sub>, but a otherwise, i.e., β(v) = b for v ∈ {s, v<sub>B</sub>} and β(v) = a for v ∈ V \ {s, v<sub>B</sub>}. In this case she walks along the same path Bob would take, i.e., s, v<sub>B</sub>, v<sub>AB</sub>, t. However, there is a subtle difference. At v<sub>AB</sub> the agent behaves like Alice and needs strictly more reward than Bob to remain motivated while traversing (v<sub>AB</sub>, t). A closer examination, which we omit here, shows that the variability of β makes our agent more expensive to motivate than any agent with a fixed present bias from B.

#### **4.1 Computational Consideration**

Let G be an arbitrary task graph and B a suitable present bias set. We want to construct a cost configuration c̃ that is motivating for all present bias configurations β ∈ B<sup>V</sup>, but requires as little reward as possible. As in Sect. 3, the computational challenges of this task are readily apparent. In particular, the corresponding decision problem VARIABLE PRESENT BIAS (VPB) is equivalent to MCC whenever B only contains a single element.

**Definition 4 (VPB).** *Given a task graph* G*, present bias set* B *and reward* r > 0*, decide whether a cost configuration* c̃ *motivating for all* β ∈ B<sup>V</sup> *exists.*

Because MCC is NP-complete [3], it immediately follows that VPB is NP-hard. A proof that VPB ∈ NP can be found in the full version of this paper.

**Corollary 3.** *VPB is NP-complete.*

As it is NP-hard to find optimal cost configurations for general B, we turn to the optimization version of the problem. For this purpose let r(G, B<sup>V</sup>) be the infimum over all rewards admitting a cost configuration c̃ motivating for all β ∈ B<sup>V</sup> and define VPB-OPT as:

**Definition 5 (VPB-OPT).** *Given a task graph* G *and present bias set* B*, determine* r(G, B<sup>V</sup>)*.*

Interestingly, approximating VPB-OPT seems to be much harder than UPB-OPT. The reason why the 2-approximation for UPB-OPT, i.e., UncertainPresentBiasApprox, does not work anymore is simple. Recall that the cost configuration c̃ returned by the algorithm lets the agent take shortcuts along cheapest paths to t. To ensure that these shortcuts do not become too expensive, c̃ assigns extra cost to their initial edge. This way the perceived cost within a shortcut should not be greater than that for entering. As long as the present bias is fixed, this works fine. However, if the present bias can change, the agent may become more biased within a shortcut and require higher rewards to stay motivated. One way to fix this problem is to let the assigned extra cost depend on τ = max B/ min B, i.e., the range of B. More precisely, we multiply the cost assigned in line 9 of Algorithm 1 by τ and change line 5 to assign a cost of c̃(e) = (1 + τ)α/b + 1. As a result we obtain a new algorithm VariablePresentBiasApprox with an approximation ratio of 1 + τ.

**Theorem 2.** *VPB-OPT admits a polynomial time* (1 + τ)*-approximation.*

Although VariablePresentBiasApprox yields a good approximation for a moderately variable present bias, it does not provide constant approximation bounds like UncertainPresentBiasApprox. Surprisingly, a sophisticated reduction from VECTOR SCHEDULING (VS) [4] shows that VPB-OPT cannot have an efficient constant factor approximation unless NP = ZPP.

**Theorem 3.** *No polynomial time algorithm can approximate VPB-OPT within a constant factor* > 1*, unless* NP = ZPP*.*

#### **4.2 Occasionally Unbiased Agents**

Although VPB is hard to solve in general, this is not true for a curious special case consisting of all present bias sets B with 1 ∈ B. Note that agents whose present bias varies within such a B become temporarily unbiased whenever 1 is drawn. For this reason we call these agents *occasionally unbiased*. A behavioral pattern unique to occasionally unbiased agents is that they may start to walk along a cheapest path at any point in time, namely whenever their present bias becomes 1. As a result we can reduce VPB to a decision problem we call CRITICAL NODE SET (CNS) for occasionally unbiased agents.

**Definition 6 (CNS).** *Given a task graph* G*, present bias set* B *and reward* r*, decide the existence of a critical node set* W*.*

We consider a node set W *critical* if the following properties hold: (a) s ∈ W. (b) Each node v ∈ W has a path P to t that only uses nodes of W. (c) All edges e of P satisfy d<sub>b</sub>(e) ≤ br with b = min B. As it turns out, such a W contains exactly those nodes an occasionally unbiased agent may visit with respect to a motivating cost configuration. This allows us to reduce VPB to CNS.

**Proposition 3.** *If* 1 ∈ B*, then VPB has a solution if and only if CNS has one.*

**Algorithm 2.** DecideCriticalNodeSet

1. δ(t) ← 0;
2. **foreach** v ∈ V \ {t} **in reverse topological order do**
   1. U ← {w | (v, w) ∈ E and c(v, w) + βδ(w) ≤ b};
   2. **if** U = ∅ **then** δ(v) ← ∞;
   3. **else** δ(v) ← min{c(v, w) + δ(w) | w ∈ U};
3. **if** δ(s) < ∞ **then return** "yes" **else return** "no";

All that remains to show is that CNS is decidable in polynomial time. A straightforward approach to this simple algorithmic problem is DecideCriticalNodeSet. We therefore conclude that VPB is efficiently solvable for occasionally unbiased agents.
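A direct transcription of DecideCriticalNodeSet might look as follows. This is a sketch under stated assumptions rather than a definitive implementation: the filter for the set U is read as c(v, w) + b · δ(w) ≤ b · r with b = min B, in line with property (c) of a critical node set, and the graph encoding, the function name and the toy instance are hypothetical.

```python
from math import inf


def decide_critical_node_set(order, cost, s, t, b, r):
    """Decide whether a critical node set exists (sketch of Algorithm 2).

    order: nodes in topological order, with t reachable from every node
    cost:  dict mapping edges (v, w) to their cost c(v, w)
    b:     smallest present bias, b = min B
    r:     reward
    """
    delta = {t: 0.0}
    for v in reversed(order):  # successors are handled before v
        if v == t:
            continue
        # ASSUMPTION: w is admissible if c(v, w) + b * delta(w) <= b * r.
        admissible = [
            c + delta[w]
            for (x, w), c in cost.items()
            if x == v and c + b * delta[w] <= b * r
        ]
        delta[v] = min(admissible) if admissible else inf
    return delta[s] < inf


# Hypothetical toy instance with B = {1/2, 1}, not taken from the paper.
order = ["s", "a", "t"]
cost = {("s", "a"): 1.0, ("a", "t"): 4.0, ("s", "t"): 6.0}
print(decide_critical_node_set(order, cost, "s", "t", b=0.5, r=8))  # prints True
print(decide_critical_node_set(order, cost, "s", "t", b=0.5, r=7))  # prints False
```

Each edge is inspected a constant number of times per node in this encoding, so the running time is polynomial in the size of G. An adjacency-list representation would bring it down to linear time after topological sorting.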

**Corollary 4.** *If* 1 ∈ B*, then VPB can be solved in polynomial time.*

#### **4.3 The Price of Variability**

To conclude our work, we take a step back from computational considerations and look at the implications of variability from a more general perspective. Our goal is to quantify the conceptual loss of efficiency incurred by going from a fixed and known present bias to an unpredictable and variable one. Similar to the price of uncertainty we define the *price of variability* as the following ratio.

**Definition 7 (Price of Variability).** *Given a task graph* G *and a present bias set* B*, the price of variability is defined as* r(G, B<sup>V</sup>)/ sup{r(G, {β}) | β ∈ B}*.*

It seems obvious that the price of variability depends closely on the structure of G and B. Nevertheless, we would like to find general bounds for the price of variability much like we did in Sect. 3 for the price of uncertainty. As a first step, it is instructive to note that the price of uncertainty is a natural lower bound for the price of variability. The reason for this is that each cost configuration that motivates an agent whose present bias varies arbitrarily in B must also motivate an agent whose present bias is a fixed value from B. Therefore it holds true that r(G, B<sup>V</sup>) ≥ r(G, B), which immediately implies the stated bound. Sometimes this bound is tight. Consider for instance Alice and Bob's scenario. As we have shown in Sect. 3, it is possible to construct a cost configuration c̃ verifying a price of uncertainty of 1. Using similar arguments, it is easy to see that c̃ remains motivating if we allow the present bias to vary, implying an identical price of variability. However, for general instances of G and B this tight relation between the price of uncertainty and the price of variability is lost. In fact, we can show that unlike the price of uncertainty, which has a constant upper bound of 2, the price of variability may become arbitrarily large as the range of B increases.

**Proposition 4.** *There exists a family of task graphs and present bias sets for which the price of variability converges to* τ /2*.*

Although Proposition 4 implies that the price of variability can become substantially larger than the price of uncertainty, it should be noted that the task graph constructed in the proof of this proposition is close to a worst case scenario. In particular, we can show that the price of variability cannot exceed τ + 1, which is roughly twice the value obtained by Proposition 4. To verify this upper bound, it is helpful to recall the proof of Theorem 2. In the process of establishing the approximation ratio of VariablePresentBiasApprox we have argued that the cost configuration c̃ returned by the algorithm motivates any agent with a present bias configuration β ∈ B<sup>V</sup> for a reward of at most (τ + 1)r(G, {min B}). Consequently, it holds true that r(G, B<sup>V</sup>) ≤ (τ + 1)r(G, {min B}), implying that the price of variability cannot exceed τ + 1.
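Spelled out, the last implication uses that min B is itself an element of B, so that sup{r(G, {β}) | β ∈ B} ≥ r(G, {min B}). Assuming r(G, {min B}) > 0, the bound reads:

$$\frac{r(G, B^V)}{\sup\{r(G, \{\beta\}) \mid \beta \in B\}} \le \frac{(\tau + 1)\, r(G, \{\min B\})}{r(G, \{\min B\})} = \tau + 1.$$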

**Corollary 5.** *The price of variability is at most* τ + 1*.*

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.